ReHap: An Integrated System for the Haplotype Assembly Problem from Shotgun Sequencing Data

Authors

  • Filippo Geraci
  • Marco Pellegrini
Abstract

Single nucleotide polymorphisms (SNPs) are the most common form of DNA variation. The set of SNPs present in a chromosome (called the haplotype) is of interest in a wide range of applications in molecular biology and biomedicine. Personalized haplotyping of (portions of/all) the chromosomes of individuals is one of the most promising basic ingredients leading to effective personalized medicine (including diagnosis and, eventually, therapy). Personalized haplotyping is now becoming technically and economically feasible thanks to steady progress in shotgun sequencing technologies (see, e.g., the 1000 Genomes Project, "A deep catalogue of human genetic variations"). One key algorithmic problem in this process is the haplotype assembly problem (also known as the single individual haplotyping problem): reconstructing the two haplotype strings (paternal and maternal) from the large collection of short fragments produced by PCR-based shotgun technology. Although many algorithms for this problem have been proposed in the literature, there has been little progress on comparing them on a common basis or on supporting the selection of the best algorithm for the type of fragments generated by a specific experiment. In this paper we present ReHap, an easy-to-use AJAX-based web tool that provides a complete experimental environment for comparing five different assembly algorithms under a variety of parameter settings, taking user-generated data as input and/or providing several fragment-generation simulation tools. This is the first published report of a comparison among five different haplotype assembly algorithms on a common data and algorithmic framework. The system is freely available to researchers at the URL: http://bioalgo.iit.cnr.it/rehap/.
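
To make the haplotype assembly problem described above concrete, the following minimal Python sketch scores candidate haplotype pairs against a set of fragments using the minimum error correction (MEC) objective, a criterion commonly used in the haplotype assembly literature. The abstract does not state which objective the five algorithms in ReHap optimize, so MEC is an assumption here, as are the fragment encoding (strings over `0`/`1` with `-` for uncovered SNPs), the restriction to heterozygous sites (so the two haplotypes are complementary), and every function name. The brute-force search is exponential and only illustrates the problem statement; it is not one of the compared algorithms.

```python
from typing import List, Tuple

GAP = "-"  # hypothetical marker: SNP position not covered by the fragment


def mismatches(fragment: str, haplotype: str) -> int:
    """Count covered positions where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != GAP and f != h)


def complement(haplotype: str) -> str:
    """Flip every allele (assumes all sites are heterozygous)."""
    return "".join("1" if a == "0" else "0" for a in haplotype)


def mec_score(fragments: List[str], h1: str, h2: str) -> int:
    """Assign each fragment to its closer haplotype and sum the leftover mismatches."""
    return sum(min(mismatches(f, h1), mismatches(f, h2)) for f in fragments)


def brute_force_assembly(fragments: List[str], n_snps: int) -> Tuple[str, str, int]:
    """Enumerate all haplotype pairs; feasible only for tiny illustrative inputs."""
    best = None
    # Fixing the first allele of h1 to 0 skips mirrored (h2, h1) duplicates.
    for code in range(2 ** (n_snps - 1)):
        h1 = format(code, f"0{n_snps}b")
        h2 = complement(h1)
        score = mec_score(fragments, h1, h2)
        if best is None or score < best[2]:
            best = (h1, h2, score)
    return best


# Toy input: four error-free fragments plus one fragment with a single read error.
fragments = ["010-", "-101", "101-", "-010", "0111"]
h1, h2, score = brute_force_assembly(fragments, n_snps=4)
print(h1, h2, score)
```

On this toy input the search recovers the pair `0101` / `1010` with one corrected error, which corresponds to the single erroneous allele in the last fragment.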


Similar resources

A comparison of several algorithms for the single individual SNP haplotyping reconstruction problem

MOTIVATION: Single nucleotide polymorphisms are the most common form of variation in human DNA, and are involved in many research fields, from molecular biology to medical therapy. The technological opportunity to deal with long DNA sequences using shotgun sequencing has raised the problem of fragment recombination. In this regard, the Single Individual Haplotyping (SIH) problem has received conside...


Theory and Algorithms for the Haplotype Assembly Problem

Genome sequencing studies to date have generally sought to assemble consensus genomes by merging sequence contributions from multiple homologous copies of each chromosome. With growing interest in genetic variations, however, there is a need for methods to separate these distinct contributions and assess how individual homologous chromosome copies differ from one another. An approach to this pr...


Optimisation of assembly scheduling in VCIM systems using genetic algorithm

Assembly plays an important role in any production system as it constitutes a significant portion of the lead time and cost of a product. Virtual computer-integrated manufacturing (VCIM) system is a modern production system being conceptually developed to extend the application of traditional computer-integrated manufacturing (CIM) system to global level. Assembly scheduling in VCIM systems is ...


Accuracy Assessment of Consensus Sequence from Shotgun Sequencing

The significance of any genetic or biological implication based on DNA sequencing depends on its accuracy. The statistical evaluation of accuracy requires a probabilistic model of measurement error. In this chapter, we describe two statistical models of sequence assembly from shotgun sequencing respectively for the cases of haploid and diploid target genome. The first model allows us to convert...


Sequencing Mixed Model Assembly Line Problem to Minimize Line Stoppages Cost by a Modified Simulated Annealing Algorithm Based on Cloud Theory

This research presents a new application of the cloud theory-based simulated annealing algorithm to solve mixed model assembly line sequencing problems where line stoppage cost is expected to be optimized. This objective is highly significant in mixed model assembly line sequencing problems based on just-in-time production system. Moreover, this type of problem is NP-hard and solving this probl...




Publication date: 2010